Text File | 1995-02-01 | 54KB | 1,225 lines
===========================================================================
===========================================================================
============================                   ============================
============================                   ============================
============================   PARSE-O-MATIC   ============================
============================                   ============================
============================                   ============================
===========================================================================
===========================================================================
---------------------------------------------------------------------------
| |
| HERE ARE A FEW OF THE THINGS PARSE-O-MATIC CAN DO FOR YOU: |
| |
| Importing Exporting Automated Editing |
| Text Extraction Data Conversion Table Lookup |
| Retabulation Info Weeding Selective Copying |
| Binary-File to Text Report Reformatting Wide-Text Folding |
| Auto-Batch Creation Comm-log Trimming Tab Replacement |
| Character Filtering Column Switching DBF Interpretation |
| |
| "Parse-O-Matic is a wonderful time saver .... Each report that |
| I can convert from our ... accounting system saves our company |
| about 500 man hours per year" -- R. Brooker (a happy POM user) |
| |
---------------------------------------------------------------------------
Parse-O-Matic is Copyright (C) 1992, 1995 by:
Pinnacle Software, CP 386 Mount Royal, Quebec, Canada H3P 3C6
U.S. Office: Box 714 Airport Road, Swanton, Vermont 05488 USA
Support Line (514) 345-9578 --- Free Files BBS (514) 345-8654
---------------------------------------------------------------------------
| |
| |
| This is a SHAREWARE product. That means we would like you to |
| pass around unregistered copies to other people. If you have |
| a modem, you can upload a copy to a local BBS, or give a copy |
| to a friend who will save time and effort with Parse-O-Matic. |
| |
| |
---------------------------------------------------------------------------
===========================================================================
OVERVIEW
===========================================================================
INTRODUCTION
------------
Why you need Parse-O-Matic -- an example
Parse-O-Matic to the rescue!
How it works
FUNDAMENTALS
------------
The Parse-O-Matic command
The POM file
Padding for clarity
COMMAND WORDS
-------------
The SET command
The IF command
The BEGIN and END commands
The OUT and OUTEND commands
The OUTHDG and PAGELEN commands
The MINLEN command
The IGNORE command
The ACCEPT command
The TRIM command
The PAD command
The INSERT command
The CHANGE command
The SPLIT command
The CHOP command
The LOOKUP command
The LOOKFILE command
The LOOKCOLS command
The LOOKSPEC command
The TRACE command
TERMS AND TECHNIQUES
--------------------
Values
Delimiters
Illegal characters
Incrementing
Line counters
Tracing
Quiet mode
DBF Files
Converting comma-delimited files
Examples
===========================================================================
INTRODUCTION
===========================================================================
Parse-O-Matic is a programmable file-parser. Simple enough for even a non-
programmer to master, it can help out in countless ways. Here are some of
the things Parse-O-Matic can do: Importing, Exporting, Automated Editing,
Text Extraction, Data Conversion, Table Lookup, Retabulation, Info Weeding,
Selective Copying, Binary-File to Text, Tab Replacement, Reformatting,
Wide-Text Folding, Auto-Batch Creation, Character Filtering, Column
Switching, DBF Interpretation, Report Generation, and more!
----------------------------------------
WHY YOU NEED PARSE-O-MATIC -- AN EXAMPLE
----------------------------------------
There are plenty of programs out there that have valuable data locked away
inside them. How do you get that data OUT of one program and into another
one?
Some programs provide a feature which "exports" a file into some kind of
generic format. Perhaps the most popular of these formats is known as a
"comma-delimited file", which is a text file in which each data field is
separated by a comma. Literal strings -- which might themselves contain
commas -- are surrounded by double quotes. So a few lines from a
comma-delimited file might look something like this (an export from a
hypothetical database of people who owe your company money):
+-------------------------------------------------------------------------+
| "JONES","FRED","1234 GREEN AVENUE","KANSAS CITY","MO",293.64 |
| "SMITH","JOHN","2343 OAK STREET","NEW YORK","NY",22.50 |
| "WILLIAMS","JOSEPH","23 GARDEN CRESCENT","TORONTO","ON",16.99 |
+-------------------------------------------------------------------------+
Unfortunately, not all programs export or import data in this format.
Even more frustrating is a program that exports data in a format that is
ALMOST what you need!
If that's the case, you might decide to spend a few hours in a text editor,
modifying the export file so that the other program can understand it. Or
you might write a program to do the editing for you. Both solutions are
time-consuming.
An even more challenging problem arises when a program which has no export
capability does have the ability to "print" reports to a file. You can
write a program to read these files and convert them to something you can
use, but this can be a LOT of work!
----------------------------
PARSE-O-MATIC TO THE RESCUE!
----------------------------
Parse-O-Matic is a utility that reads text, fixed-length and DBF (DBase)
files, interprets the data, and outputs the result to a text file. It can
help you "boil down" reports into their essential data. You can also use
it to convert NEARLY compatible import files, or generate printable
reports.
------------
HOW IT WORKS
------------
You need three things:
1) The Parse-O-Matic program
2) A Parse-O-Matic "POM" file (to tell Parse-O-Matic what to do)
3) The input file
The input file is usually a report from another program, a fixed record
length data file, or a DBF (DBase) file. We've provided several examples
of typical input files. For example, the file EXAMPLE2.TXT comes from the
AccPac accounting software. AccPac is a great program, but its export
capabilities leave something to be desired. Parse-O-Matic can help!
To see detailed demonstrations of how these files can be parsed, enter
START at the DOS prompt, then select EXAMPLES.
===========================================================================
FUNDAMENTALS
===========================================================================
This documentation assumes that you are an experienced computer user. If
you have trouble, you might ask a programmer to help you -- POM file
creation is a little like programming!
-------------------------
THE PARSE-O-MATIC COMMAND
-------------------------
The basic format of the Parse-O-Matic command line is:
POM pom-file input-file output-file
Here is an example, as you would type it at the DOS command line:
POM POMFILE.POM REPORT.TXT OUTPUT.TXT
For a more formal description of the command line, start up POM by typing
this command at the DOS prompt:
POM
------------
THE POM FILE
------------
The POM file is a text file with a .POM extension. The following
conventions are used when interpreting the POM file:
- Null lines and lines starting with a semi-colon (comments) are ignored.
- A POM file may contain up to 500 lines of specifications.
Comment lines do not count in this total.
A POM file contains no "loops" (to use the programming term). Each line of
the input file is processed by the entire POM file. If you'd like it
expressed in terms of programming languages, here's what POM does:
+-------------------------------------------------------------------------+
| START: If there's nothing left in the input file, go to QUIT. |
| Read a line from the input file |
| Do everything in the POM file |
| Go to START |
| QUIT: Tell the user you're finished! |
+-------------------------------------------------------------------------+
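If you know a modern programming language, the same cycle can be sketched
in a few lines of Python. This is only an illustration of the flow shown
above, not Parse-O-Matic's actual code; the names run_pom and
copy_every_line are made up for the example.

```python
from typing import Callable, Iterable, List


def run_pom(input_lines: Iterable[str],
            pom_file: Callable[[str, List[str]], None]) -> List[str]:
    """Run the whole 'POM file' once for every line of input."""
    output: List[str] = []
    for fline in input_lines:      # START: stop when nothing is left
        pom_file(fline, output)    # do everything in the POM file
    return output                  # QUIT


def copy_every_line(fline: str, output: List[str]) -> None:
    """Stand-in for a POM file that simply copies each input line."""
    output.append(fline)


if __name__ == "__main__":
    print(run_pom(["first line", "second line"], copy_every_line))
```

The point to notice is that the "loop" lives outside the POM file: the POM
file itself only describes what to do with one input line.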
-------------------
PADDING FOR CLARITY
-------------------
Spaces and tabs between the words and variables in a POM file line are
generally ignored (except in the case of the OUT and OUTEND commands). You
can use spaces to make your POM files easier to read.
Additionally, in any line in the POM file, the following terms are ignored:
= THEN ELSE
These can be added to make the lines easier to read. For example, the IF
command can be written in any of the following ways:
Very terse:          IF PRICE "0.00" BONUS "0.00" "1.00"
Padded with spaces:  IF PRICE   "0.00"      BONUS   "0.00"      "1.00"
Fully padded:        IF PRICE = "0.00" THEN BONUS = "0.00" ELSE "1.00"
===========================================================================
COMMAND WORDS
===========================================================================
For ease of learning, the command words are explained in the following
order:
+-------------------------------------------------------------------------+
|                                                                         |
|  COMMANDS WHICH WILL...              LIST OF COMMANDS                   |
|  ----------------------------------  ---------------------------------  |
|  Break up an input line into fields  SET IF                             |
|  Control processing flow             BEGIN END                          |
|  Generate or control output          OUT OUTEND OUTHDG PAGELEN          |
|  Accept or reject input              MINLEN IGNORE ACCEPT               |
|  Alter fields                        TRIM PAD INSERT CHANGE             |
|  Preprocess input                    SPLIT CHOP                         |
|  Look up data in another file        LOOKUP LOOKFILE LOOKCOLS LOOKSPEC  |
|  Trace processing                    TRACE                              |
|                                                                         |
+-------------------------------------------------------------------------+
Here is a quick-reference table of all the commands. The following conven-
tions are used in the table:
"var" means a variable that is being set.
"value" means a variable whose value is being read.
Square brackets [like this] indicate optional items.
------------------------------------------- ------------------------------
COMMAND FORMATS                             EXAMPLE
=========================================== ==============================
SET var1 value1                             SET NAME $FLINE[20 26]
IF value1 value2 var1 value3 [value4]       IF X = "Y" THEN Z = "N"
------------------------------------------- ------------------------------
BEGIN value1 value2                         BEGIN LINECNTR = "3"
END                                         END
------------------------------------------- ------------------------------
OUT [value1 value2] |output-picture         OUT "X" "X" |{PRICE}
OUTEND [value1 value2] |output-picture      OUTEND "X" "X" |{$FLINE}
OUTHDG value1                               OUTHDG "LIST OF EMPLOYEES"
PAGELEN value1 [value2]                     PAGELEN "66" "N"
------------------------------------------- ------------------------------
MINLEN value1                               MINLEN "15"
IGNORE value1 value2                        IGNORE PRICE "0.00"
ACCEPT value1 value2                        ACCEPT $FLINE[1 3] "YES"
------------------------------------------- ------------------------------
TRIM var1 spec1 character                   TRIM PRICE "R" "$"
PAD var1 spec1 character len                PAD SERIALNUM "L" "0" "10"
INSERT var1 spec1 value1                    INSERT PRICE "L" "$"
CHANGE var1 value1 value2                   CHANGE DATE "/" "-"
------------------------------------------- ------------------------------
SPLIT from to [,from to] [...]              SPLIT 1 250, 251 300
CHOP from to [,from to] [...]               CHOP 1 250, 251 300
------------------------------------------- ------------------------------
LOOKUP var1 value1                          LOOKUP PHONENUM "FRED JONES"
LOOKFILE value1                             LOOKFILE "C:\TABLES\DATA.TBL"
LOOKCOLS value1 value2 value3 value4        LOOKCOLS "1" "3" "8" "255"
LOOKSPEC value1 value2 value3               LOOKSPEC "Y" "N" "N"
------------------------------------------- ------------------------------
TRACE var1                                  TRACE PRICE
------------------------------------------- ------------------------------
The commands are explained in more detail (and in the same order) in the
following sections.
---------------
The SET Command
---------------
FORMAT: SET var1 value1
SET assigns a value to a variable. The usual reason to do this is to set a
variable from the input line (represented by the variable $FLINE) prior to
cleaning it up with TRIM. For example, if the input line looked like this:
JOHN       SMITH     555-1234   322 Westchester Lane    Architect
|          |         |          |                       |
Column 1   Col 12    Col 22     Col 33                  Col 57
then we could extract the last name from the input line with these two POM
commands:
SET NAME = $FLINE[12 21] (Sets the variable from the input line)
TRIM NAME "R" " " (Trims any spaces on the right side)
SET would first set the variable NAME to this value: "SMITH     "
After the TRIM, the variable NAME would have the value: "SMITH"
You will also use SET if you plan to include a substring of $FLINE in the
output, since the OUT and OUTEND commands do not recognize substrings after
the "|" marker, only complete variables.
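The column arithmetic can be tried out in Python. The sketch below assumes,
as the example above implies, that columns are counted from 1 and that both
the starting and ending columns are included. The helper name "columns" is
ours, not a POM command, and rstrip merely stands in for TRIM.

```python
def columns(line: str, first: int, last: int) -> str:
    """Return columns first..last of line (counted from 1, both ends included)."""
    return line[first - 1:last]


# Rebuild the sample input line so each field starts in the column shown above.
fline = ("JOHN".ljust(11) + "SMITH".ljust(10) + "555-1234".ljust(11)
         + "322 Westchester Lane".ljust(24) + "Architect")

name = columns(fline, 12, 21)    # like: SET NAME = $FLINE[12 21]
trimmed = name.rstrip(" ")       # like: TRIM NAME "R" " "
print(repr(name), repr(trimmed))
```

Columns 12 to 21 are ten characters wide, so the untrimmed value is SMITH
followed by five spaces.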
--------------
The IF Command
--------------
FORMAT: IF value1 value2 var1 value3 [value4]
If value1 contains value2, var1 is set to value3. Otherwise, it is set to
value4. If value4 is missing, nothing is done (i.e. var1 is not changed).
Here's an example of the IF command...
SET EARNING = $FLINE[20 26]
TRIM EARNING "A" " "
IF EARNING = "0.00" THEN BONUS = "0.00" ELSE "1.00"
This would obtain the value between columns 20 and 26, remove any spaces,
then check if it equals "0.00". If it does, the variable BONUS is set to
"0.00". If not, BONUS is set to "1.00".
--------------------------
The BEGIN and END Commands
--------------------------
The format for the BEGIN and END commands is as follows:
BEGIN value1 value2
:
Dependent code
:
END
If value1 equals value2, then the dependent code (the POM lines between
the BEGIN and the END) is executed. If value1 does not equal value2,
then the dependent code is skipped.
It is traditional in programming to indent code that appears in blocks
such as Parse-O-Matic's BEGIN/END technique. This makes the logic of
the program easier to understand. For example:
BEGIN datatype = "Employee"
   SET phone = $FLINE[ 1 10]
   SET address = $FLINE[12 31]
END
BEGIN/END blocks can be nested. That is to say, you can have BEGIN/END
blocks inside other BEGIN/END blocks. Here is an example, with arrows
to indicate the levels of each BEGIN/END block...
BEGIN datatype = "Employee"        <---------------------
   SET phone = $FLINE[ 1 10]                            |
   SET address = $FLINE[12 31]                          |
   SET areacode = phone[1 3]                            | First
   BEGIN areacode = "514"          <------- Second      | Level
      SET local = "Y"                     | Level       | Block
      SET tax = "Y"                <------- Block       |
   END                                                  |
END                                <---------------------
In this case, the "inner" block (starting with BEGIN areacode = "514")
would only be reached if the "outer" block (BEGIN datatype = "Employee")
was true. If the outer block was false, the inner block would never be
executed.
A nested BEGIN/END block must always be completely inside the outer
block. Study the following (incorrect) example:
BEGIN datatype = "Employee"               <----
   SET phone = $FLINE[ 1 10]                  | First
   SET areacode = phone[1 3]                  | Level
   BEGIN areacode = "514"          <---       | Block?
      SET local = "Y"                 |       |
END                                   |   <----
   SET tax = "Y"                      |
   END                             <--- Second Level Block?
Parse-O-Matic does not pay attention to the indenting -- it is only a
tradition we use to make the file easier to read. The code will be
understood this way:
BEGIN datatype = "Employee"        <---------------------
   SET phone = $FLINE[ 1 10]                            | First
   SET areacode = phone[1 3]                            | Level
   BEGIN areacode = "514"          <--- Second          | Block
      SET local = "Y"                 | Level           |
   END                             <--- Block           |
   SET tax = "Y"                                        |
END                                <---------------------
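The matching rule can be modelled in Python with a simple stack: each END
closes the most recently opened BEGIN, whatever the indentation looks like.
This is a sketch of the behaviour described above, not Parse-O-Matic's own
code; the function name pair_blocks is made up, and the sample lines are the
incorrect example with one possible (misleading) indentation.

```python
from typing import List, Tuple


def pair_blocks(pom_lines: List[str]) -> List[Tuple[int, int]]:
    """Pair each END with the most recent unmatched BEGIN.

    Returns (begin_line, end_line) pairs using 1-based line numbers.
    Leading spaces are thrown away, so indentation cannot change the result.
    """
    open_begins: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for number, line in enumerate(pom_lines, start=1):
        words = line.split()
        command = words[0].upper() if words else ""
        if command == "BEGIN":
            open_begins.append(number)
        elif command == "END":
            if not open_begins:
                raise ValueError(f"line {number}: END without a BEGIN")
            pairs.append((open_begins.pop(), number))
    if open_begins:
        raise ValueError(f"line {open_begins[-1]}: BEGIN without an END")
    return pairs


misleading = [
    'BEGIN datatype = "Employee"',
    '   SET phone = $FLINE[ 1 10]',
    '   SET areacode = phone[1 3]',
    '   BEGIN areacode = "514"',
    '      SET local = "Y"',
    'END',
    '   SET tax = "Y"',
    '   END',
]
print(pair_blocks(misleading))
```

The first END (line 6) pairs with the inner BEGIN on line 4, and the last
END (line 8) pairs with the outer BEGIN on line 1, which is exactly how the
code is understood in the diagram above.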
You can nest BEGIN/END blocks up to 25 deep. (It is quite unlikely you
will ever actually need that much nesting.) Here is an example of code
that uses nesting up to three deep:
BEGIN datatype = "Dog"             <----------------------------------
   SET breed = $FLINE[1 10]                                          | First
   BEGIN breed = "Collie"          <-----------------------          | Level
      SET sound = "Woof"                                  | Second   | Block
      BEGIN name = "Spot"          <------ Third          | Level    |
         SET attitude = "Friendly"       | Level          | Block    |
      END                          <------ Block          |          |
   END                             <-----------------------          |
   BEGIN breed = "Other"           <----------------------- Another  |
      SET sound = "Arf"                                   | Second   |
      SET attitude = "Unknown"                            | Level    |
   END                             <----------------------- Block    |
END                                <----------------------------------
Once again, the indentation is for clarity only and does not affect the
way the POM file runs. However, you will find that it makes your POM
file much easier to understand.
---------------------------
The OUT and OUTEND Commands
---------------------------
FORMAT: OUT[END] [value1 value2] |output-picture
The OUT command generates output without an end-of-line (i.e. carriage
return and linefeed characters). The OUTEND command generates output and
also adds an end-of-line.
When value1 matches value2 (or if the comparison is omitted), a line is
output to the output file, according to the output picture. Within the
output picture, all text is taken literally (i.e. " is taken to mean
literally that -- a quotation mark character).
The only exception to this is variable names, which are identified by the
{ and } characters. For example, a POM file that contained the following
single line:
OUTEND "X" = "X" |{$FLINE}
would simply output every line from the input file (not very useful!).
The "X" = "X" part of the command is the comparator which controls when
output occurs; if both parts of the comparator are forced to the same
value, output will always occur.
NOTE: For efficiency, OUT does not write immediately to the output file; it
accumulates the output until it reaches 255 characters before writing. You
must do an OUTEND command to ensure that the data is actually written.
You cannot use substrings after the "|" marker. Thus, the following line
is NOT legal:
OUTEND $FLINE[1 3] = "IBM" |{$FLINE[1 15]}
The correct way to code this is as follows:
SET CODE = $FLINE[1 15]
OUTEND $FLINE[1 3] = "IBM" |{CODE}
This would output the first 15 characters of any line that contains the
letters IBM in the first three positions.
-------------------------------
The OUTHDG and PAGELEN Commands
-------------------------------
FORMAT: OUTHDG value1
FORMAT: PAGELEN value1 [value2]
OUTHDG is used to place text headers in your output. For example, if you
were parsing data to create an employee report, you might use OUTHDG like
this:
SET EMPNUM = $FLINE[ 1 5]
SET NAME = $FLINE[10 28]
SET PHONE = $FLINE[30 45]
OUTHDG "EMPL# NAME                PHONE NUMBER"
OUTHDG "----- ------------------- ------------"
OUTEND |{EMPNUM} {NAME} {PHONE}
The value following the OUTHDG command is sent to the output file only
once. That is to say, after an OUTHDG sends a value to the output file,
subsequent encounters with that OUTHDG command are ignored -- unless the
PAGELEN command is used.
The PAGELEN command specifies the length of the output page. Lines from
both OUTHDG and OUTEND are counted. The default value for page length is
zero, which means that the output is a single page of infinite length. As
such, OUTHDG headings appear only the first time they are encountered.
If you specify a page length greater than zero, OUTHDG headings become
re-enabled once the specified number of output lines have been generated.
A typical value is as follows:
PAGELEN "55"
This is an ideal page length for most laser printers. Dot matrix printers
typically use a page length of 66.
Parse-O-Matic inserts a "form feed" (ASCII 12) character between pages.
You can turn this off, however, by specifying the page length this way:
PAGELEN "66" "N"
The "N" specification means, "No, don't use form feeds". Another
acceptable value is "Y", meaning "Yes, use form feeds", but since this is
the default, you do not have to specify it.
------------------
The MINLEN Command
------------------
FORMAT: MINLEN value1
MINLEN specifies the minimum length a line must be to be considered for
parsing. If you omit the MINLEN command, the minimum length is assumed to
be 1. That is to say, all lines 1 character or longer will be processed
and shorter lines (null lines in other words) will be ignored.
MINLEN is useful for ignoring brief information lines that clutter up a
report that you are parsing. For example, in the sample file EXAMPLE2.POM,
the MINLEN command is set to 85 to ensure that all lines shorter than 85
characters long will be ignored. This simplifies the coding considerably.
The longest allowable input line is 255 characters, unless you use the
SPLIT or CHOP command (described later).
------------------
The IGNORE Command
------------------
FORMAT: IGNORE value1 value2
When value1 contains value2, the input line is ignored and all further
processing on the input line stops. The usual format of this command is as
in this example:
IGNORE $FLINE[3 9] = "Date"
This would skip any input line that contains the word "Date" between
columns 3 and 9 ($FLINE is the line just read from the input file).
------------------
The ACCEPT Command
------------------
FORMAT: ACCEPT value1 value2
The ACCEPT command accepts the input line if value1 contains value2. For
example, if the entire POM file read as follows:
ACCEPT $FLINE[15 17] = "YES"
OUTEND "X" = "X" |{$FLINE}
then any input line that contains "YES" starting in column 15 would be sent
to the output file. All other lines would be ignored.
CLUSTERED ACCEPTS: Sometimes you have to check more than one value to see
if the input line is valid. You do this by using "clustered ACCEPTs",
which are several ACCEPT commands in a row.
Briefly stated, if you have several ACCEPTs in a row ("clustered"), they
are all processed to determine if the input line is acceptable or not. If
even one ACCEPT matches up, the line is accepted. To express this in more
detail...
When value1 contains value2, the line is accepted, and processing of the
POM file continues for that input line, even if the immediately following
ACCEPTs do NOT produce a match. After all, we've already got a match!
If value1 does NOT contain value2, Parse-O-Matic looks at the next command
in the POM file. If it is not another ACCEPT, the input line is ignored.
If it is another ACCEPT, maybe it will produce a match -- so Parse-O-Matic
moves to that command.
The following POM file uses clustered ACCEPTs to accept any line that
contains the name "FRED" or "MARY" between columns 5 and 8, or contains the
word "MEMBER" between columns 20 and 25.
SET NAME = $FLINE[5 8] (Set the variable)
ACCEPT NAME = "FRED" (Look for FRED)
ACCEPT NAME = "MARY" (Look for MARY)
ACCEPT $FLINE[20 25] = "MEMBER" (Look for MEMBER)
OUTEND "X" = "X" |{$FLINE} (Output the line if we get this far)
The following example would NOT work, however:
ACCEPT $FLINE[20 25] = "MEMBER"
SET NAME = $FLINE[5 8]
ACCEPT NAME = "FRED"
ACCEPT NAME = "MARY"
OUTEND "X" = "X" |{$FLINE}
It would not work because the ACCEPTs are not clustered; if the first
ACCEPT fails, the input line will be rejected as soon as the SET command is
encountered. The next two ACCEPTs would not be reached in such a case.
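The clustering rule can be modelled in Python. The sketch below treats a
POM file as a list of (command word, test) pairs and rejects a line as soon
as a whole run of consecutive ACCEPTs fails to match. It is a simplified
model of the rule described above, not Parse-O-Matic's own logic; the names
cols and line_is_accepted are made up, and commands other than ACCEPT are
simply stepped over.

```python
from typing import Callable, List, Optional, Tuple

# Each command is (command word, test). Only ACCEPT commands carry a test.
Command = Tuple[str, Optional[Callable[[str], bool]]]


def cols(line: str, first: int, last: int) -> str:
    """Columns first..last of line, counted from 1, both ends included."""
    return line[first - 1:last]


def line_is_accepted(fline: str, commands: List[Command]) -> bool:
    """Return False as soon as a whole cluster of ACCEPTs fails to match."""
    i = 0
    while i < len(commands):
        if commands[i][0] != "ACCEPT":
            i += 1                       # not an ACCEPT: step over it
            continue
        matched = False
        while i < len(commands) and commands[i][0] == "ACCEPT":
            test = commands[i][1]
            if test is not None and test(fline):
                matched = True           # one match is enough for the cluster
            i += 1
        if not matched:
            return False                 # cluster ended without a match
    return True


clustered: List[Command] = [
    ("SET", None),
    ("ACCEPT", lambda ln: "FRED" in cols(ln, 5, 8)),
    ("ACCEPT", lambda ln: "MARY" in cols(ln, 5, 8)),
    ("ACCEPT", lambda ln: "MEMBER" in cols(ln, 20, 25)),
    ("OUTEND", None),
]
# Same commands, but with the MEMBER test moved ahead of the SET.
not_clustered: List[Command] = [clustered[3], clustered[0], clustered[1],
                                clustered[2], clustered[4]]

mary = " " * 4 + "MARY" + " " * 11 + "GUEST"   # MARY in columns 5-8
print(line_is_accepted(mary, clustered))       # True
print(line_is_accepted(mary, not_clustered))   # False
```

In the second arrangement the MEMBER test forms a cluster of its own, so a
line without MEMBER in columns 20 to 25 is rejected before the FRED and MARY
tests are ever tried.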
----------------
The TRIM Command
----------------
FORMAT: TRIM var1 spec1 character
TRIM removes characters from var1. This is usually used to remove blanks.
spec1 can be: A=All B=Both ends L=Left side only R = Right side only
For example:
SET PRICE = $FLINE[20 26]
TRIM PRICE "A" ","
TRIM PRICE "L" "$"
This would remove all commas from the variable "PRICE", and remove the
leading dollar sign. Thus:
If the input contained the string: "$25,783"
The first TRIM would change it to: "$25783"
The second TRIM would change it to: "25783"
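For comparison, here is a Python sketch of the four TRIM settings. It is an
illustration only: the function name trim is ours, and it assumes that the
third parameter is a single character, as in the examples above.

```python
def trim(value: str, spec: str, character: str) -> str:
    """Remove character from value according to spec (A, B, L or R)."""
    if spec == "A":                      # All occurrences
        return value.replace(character, "")
    if spec == "B":                      # Both ends
        return value.strip(character)
    if spec == "L":                      # Left side only
        return value.lstrip(character)
    if spec == "R":                      # Right side only
        return value.rstrip(character)
    raise ValueError(f"unknown TRIM spec: {spec!r}")


price = "$25,783"
price = trim(price, "A", ",")    # like: TRIM PRICE "A" ","
print(price)
price = trim(price, "L", "$")    # like: TRIM PRICE "L" "$"
print(price)
```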
---------------
The PAD Command
---------------
FORMAT: PAD var1 spec1 character len
PAD makes var1 a specified length, padded with a specified character.
spec1 is "L", "R", or "C" (Left, Right or Center)
character is the character used to pad the string
len is the desired string length
For example, if the variable ABC is set to "1234" ...
PAD ABC "L" "0" "7" left-pads it 7 characters wide with zeros ("0001234")
PAD ABC "R" " " "5" right-pads it 5 characters wide with spaces ("1234 ")
PAD ABC "C" "*" "8" would center it, 8 wide, with asterisks ("**1234**")
If the length is less than the length of the string, it is unchanged. For
example, if you set variable XYZ to "PINNACLE", then
PAD XYZ "R" " " "3"
would leave the string as-is ("PINNACLE").
Thus, PAD cannot be used to shorten a string. If it was your intention to
make XYZ 3 letters long, it would be appropriate to use the SET command:
SET XYZ = XYZ[1 3]
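The same rules can be sketched in Python. The function name pad is made up
for the illustration. One assumption to note: the manual does not say where
the extra character goes when centering needs an odd number of pad
characters, so this sketch simply uses whatever Python's str.center does.

```python
def pad(value: str, spec: str, character: str, length: int) -> str:
    """Pad value to length with character; never shortens the value."""
    if len(value) >= length:
        return value                         # already long enough: unchanged
    if spec == "L":                          # left-pad: fill goes on the left
        return value.rjust(length, character)
    if spec == "R":                          # right-pad: fill goes on the right
        return value.ljust(length, character)
    if spec == "C":                          # center
        return value.center(length, character)
    raise ValueError(f"unknown PAD spec: {spec!r}")


print(pad("1234", "L", "0", 7))
print(repr(pad("1234", "R", " ", 5)))
print(pad("1234", "C", "*", 8))
print(pad("PINNACLE", "R", " ", 3))
```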
------------------
The INSERT Command
------------------
FORMAT: INSERT var1 spec1 value1
The INSERT command inserts text on the left or right of var1, or at a
"found text" position.
spec1 is "L" or "R" (Left or Right) or a find-string (e.g. "@HELLO")
value1 is the value to be inserted
For example, if the variable ABC is set to "Parse-O-Matic", then
INSERT ABC "L" "Register " sets ABC to "Register Parse-O-Matic"
INSERT ABC "R" " is super" sets ABC to "Parse-O-Matic is super"
You can use a find-string to insert text at the first occurrence of the text
you specify. For example:
INSERT ABC "@-O-Matic" "!" sets ABC to "Parse!-O-Matic"
If the find-string is not found, nothing is done.
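A Python sketch of the three INSERT forms follows. The function name insert
is ours, and the sketch assumes (from the example above) that the new text
goes immediately in front of the found text.

```python
def insert(value: str, spec: str, text: str) -> str:
    """Insert text at the left, the right, or just before a find-string."""
    if spec == "L":
        return text + value
    if spec == "R":
        return value + text
    if spec.startswith("@"):
        position = value.find(spec[1:])      # first occurrence only
        if position == -1:
            return value                     # not found: nothing is done
        return value[:position] + text + value[position:]
    raise ValueError(f"unknown INSERT spec: {spec!r}")


abc = "Parse-O-Matic"
print(insert(abc, "L", "Register "))
print(insert(abc, "R", " is super"))
print(insert(abc, "@-O-Matic", "!"))
print(insert(abc, "@HELLO", "!"))
```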
------------------
The CHANGE Command
------------------
FORMAT: CHANGE var1 value1 value2
The CHANGE command replaces ALL occurrences of value1 with value2. This is
more powerful than TRIM, but is not as efficient. Here is an example of
the CHANGE command in action:
SET DATE = $FLINE[31 38]
CHANGE DATE "/" "--"
If the SET command assigned DATE the value: "93/10/15"
Then the CHANGE command would convert it to: "93--10--15"
-----------------
The SPLIT Command
-----------------
FORMAT: SPLIT from-position to-position [,from-pos'n to-pos'n] [...]
The maximum length of an input line from a text file is 255 characters. If
your input file is wider than that, you must break up the file into
manageable chunks, using the SPLIT command. This command lets you specify
the way in which each input line is broken up so that it will look like
several SEPARATE lines.
For example, if your input lines were up to 300 characters wide, you could
specify:
SPLIT 1 255, 256 300
This would break up each line as if it were two lines. Lines shorter than
256 characters would still be treated as two lines, though the second line
would be null (i.e. empty).
You can specify up to 100 splits (use multiple SPLIT commands if
necessary). With SPLIT, Parse-O-Matic can handle input records of up to
32767 characters.
The best way of handling SPLIT or CHOPped files is to use a combination of
$SPLIT (explained in more detail later) and BEGIN/END. For example:
SPLIT 1 250, 251 300
BEGIN $SPLIT = "1"
   SET a = $FLINE[ 1 10]
   SET b = $FLINE[11 20]
END
BEGIN $SPLIT = "2"
   SET x = $FLINE[ 1 10]
   SET y = $FLINE[11 20]
   OUTEND |{a} {b} {x} {y}
END
This would output the data which appears (in the input file) in columns
1-10, 11-20, 251-260 and 261-270.
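To see why the second piece restarts its column numbering at 1, here is a
Python sketch of the splitting step. It is a model of the behaviour
described above, not Parse-O-Matic's own code; the name split_line and the
sample record are made up for the illustration.

```python
from typing import List, Tuple


def split_line(line: str, ranges: List[Tuple[int, int]]) -> List[str]:
    """Cut one wide line into pieces, one per (from, to) column range.

    Columns are counted from 1 and both ends are included. A piece that
    lies beyond the end of the line comes back as an empty string.
    """
    return [line[start - 1:end] for start, end in ranges]


# A 300-column record: data in columns 1-20 and 251-270, blanks elsewhere.
record = "A" * 10 + "B" * 10 + " " * 230 + "X" * 10 + "Y" * 10 + " " * 30
first, second = split_line(record, [(1, 250), (251, 300)])
print(first[0:10], first[10:20], second[0:10], second[10:20])
print(split_line("short line", [(1, 250), (251, 300)]))
```

Column 251 of the record becomes column 1 of the second piece, so columns
11 to 20 of that piece are columns 261 to 270 of the original record. A
short input line still yields two pieces, the second one empty.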
----------------
The CHOP Command
----------------
FORMAT: CHOP from-position to-position [,from-pos'n to-pos'n] [...]
The CHOP command works the same way as the SPLIT command, with one
exception: it informs Parse-O-Matic that the input is a fixed-record-
length file. In other words, it means that the input records are
distinguished by having a particular (and exact) length, rather than being
separated by end-of-line characters (Carriage Return, Linefeed) as is the
case for a standard text file.
Thus, if you have an input file containing fixed-length records, each of
which is 200 characters wide, you could specify it like this:
CHOP 1 200
If the input record is more than 255 characters, you must break it up into
smaller chunks. For example, if the input record was 300 characters wide,
you could break it up like this:
CHOP 1 250, 251 300
By using CHOP, Parse-O-Matic can handle input records up to 32767
characters wide. You can use the $SPLIT variable to manage your use of
CHOP. See the example in the section describing the SPLIT command.
------------------
The LOOKUP Command
------------------
FORMAT: LOOKUP var1 value1
The LOOKUP command will search for value1 in a text file (the name of which
is specified either by the LOOKFILE command or the /L startup parameter).
When POM finds it, it sets var1 to another value found on the same line.
Let us suppose you created a text file, named NAMES.TBL, like this:
R. REAGAN        Ronald Reagan
D. EISENHOWER    Dwight Eisenhower
G. BUSH          George Bush
:                :
Column 1         Column 18
This file can be used to look up a name, as in this POM file:
LOOKFILE "NAMES.TBL"
LOOKCOLS "1" "17" "18" "34"
SET oldname = $FLINE[21 37]
TRIM oldname "R" " "
LOOKUP newname = oldname
OUTEND |{oldname} {newname}
The LOOKFILE command specifies the name of the look-up file. The LOOKCOLS
command specifies the starting and end columns for both the "text-to-look-
for" field (known as the key field) and the "text-to-replace-with" field
(known as the data field).
The LOOKUP command will look for oldname in NAMES.TBL. If oldname is set
to "G. BUSH", LOOKUP sets newname to "George Bush". If, however, oldname
is set to "G. WASHINGTON", which doesn't appear in NAMES.TBL, newname
is set to "" (that is to say, an empty string).
There is no limit to the number of lines that you can put in a look-up
file. However, the more lines there are, the longer it will take to
process (because there is more to search). The maximum length of a line
in a look-up file is 255 characters.
In the look-up file, null (empty) lines are ignored. You can also include
comments in the file by starting the line with a semi-colon:
; Some of the Presidents of the United States
R. REAGAN        Ronald Reagan
D. EISENHOWER    Dwight Eisenhower
G. BUSH          George Bush
The LOOKUP command can be used for more than just names, of course. You
could use it to look up prices, phone numbers, addresses and so on.
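The look-up itself can be sketched in Python as a scan through a
fixed-column table. This is an illustration, not Parse-O-Matic's own code:
the name lookup is made up, and comparing the key without regard to case or
surrounding spaces is an assumption modelled on the LOOKSPEC defaults
described later in this manual.

```python
from typing import Iterable, Tuple


def lookup(table_lines: Iterable[str], wanted: str,
           key_cols: Tuple[int, int], data_cols: Tuple[int, int]) -> str:
    """Return the data field of the first line whose key field equals wanted.

    Blank lines and lines starting with a semi-colon are skipped. An empty
    string is returned when no key matches.
    """
    for line in table_lines:
        if not line.strip() or line.startswith(";"):
            continue
        key = line[key_cols[0] - 1:key_cols[1]].strip()
        if key.upper() == wanted.strip().upper():
            return line[data_cols[0] - 1:data_cols[1]].strip()
    return ""


names_tbl = [
    "; Some of the Presidents of the United States",
    "R. REAGAN".ljust(17) + "Ronald Reagan",
    "",
    "D. EISENHOWER".ljust(17) + "Dwight Eisenhower",
    "G. BUSH".ljust(17) + "George Bush",
]
print(lookup(names_tbl, "G. BUSH", (1, 17), (18, 34)))
print(repr(lookup(names_tbl, "G. WASHINGTON", (1, 17), (18, 34))))
```

The two column pairs play the role of LOOKCOLS "1" "17" "18" "34". Because
the table is read from top to bottom, a longer table means a longer search.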
--------------------
The LOOKFILE Command
--------------------
FORMAT: LOOKFILE value1
The LOOKFILE command specifies the name of the look-up file for the next
LOOKUP command. This lets you use several look-up files in one POM file.
For example:
SET name = $FLINE[1 20]
; Look up full name
LOOKFILE "NAMES.TBL"
LOOKCOLS "1" "25" "30" "50"
LOOKUP fullname = name
; Look up phone number
LOOKFILE "PHONE.TBL"
LOOKCOLS "1" "25" "30" "40"
LOOKUP phone = name
; Output result
OUTEND |{name} {fullname} {phone}
If you only have one look-up file, you may omit the LOOKFILE command and
specify the file name on the command line, using the /L parameter. For
example, you could write a POM file like this:
SET name = $FLINE[1 20]
; Look up full name
LOOKCOLS "1" "25" "30" "50"
LOOKUP fullname = name
; Output result
OUTEND |{name} {fullname}
Your POM command could then look like this:
POM MYPOM.POM INPUT.TXT OUTPUT.TXT /LC:\MYFILES\NAMES.TBL
This technique allows you to use several different look-up files with the
same POM file, simply by changing the command line.
--------------------
The LOOKCOLS Command
--------------------
FORMAT: LOOKCOLS value1 value2 value3 value4
The LOOKCOLS command specifies the starting and ending columns for the
key and data fields in a look-up file (see the explanation of the LOOKUP
command for an overview of look-up files).
value1 specifies the starting column for the key field
value2 specifies the ending column for the key field
value3 specifies the starting column for the data field
value4 specifies the ending column for the data field
You can specify a null value to indicate "same as last time". For example:
SET name = $FLINE[1 20]
LOOKFILE "NAMES.TBL"
LOOKCOLS "1" "25" "30" "50"
LOOKUP fullname = name
LOOKFILE "PHONE.TBL"
LOOKCOLS "" "" "" "40"
LOOKUP phonenum = name
OUTEND |{name} {fullname} {phonenum}
The second LOOKCOLS command uses the same numbers for the first three
values that the first LOOKCOLS command used.
If you do not specify a LOOKCOLS command, the default values are:
Key Field: Starting column = 1
Ending column = 10
Data Field: Starting column = 12
Ending column = 255
This is equivalent to LOOKCOLS "1" "10" "12" "255".
--------------------
The LOOKSPEC Command
--------------------
FORMAT: LOOKSPEC value1 value2 value3
The LOOKSPEC command configures the way the next LOOKUP command will work.
value1 = Trim ("Y" or "N" -- default "Y")
value2 = Sorted ("Y" or "N" -- default "N")
value3 = Case-sensitive ("Y" or "N" -- default "N")
The Trim setting specifies whether or not the data field should have spaces
stripped off both ends.
The Sorted setting specifies whether or not the look-up file is sorted by
the key field. Searching a sorted file is much faster than searching an
unsorted one. This is especially noticeable if you have a large look-up
file and a lot of input to process.
The Case-sensitive setting specifies whether or not LOOKUP should
distinguish between upper and lower case when searching. The default
setting is "N" (No), so that LOOKUP would find "John Smith", even if it
appeared in the look-up file as "JOHN SMITH". It is usually safest to set
Case-sensitivity to "N", but if you set it to "Y", searching is slightly
faster.
You can specify a null value to indicate "same as last time". For example:
SET name = $FLINE[1 20]
LOOKFILE "DATA.TBL"
LOOKCOLS "1" "25" "30" "50"
LOOKSPEC "Y" "Y" "Y"
LOOKUP fullname = name
LOOKCOLS "" "" "60" "70"
LOOKSPEC "N" "" ""
LOOKUP phonenum = name
OUTEND |{name} {fullname} {phonenum}
The second LOOKSPEC command uses the same settings for Sorted and
Case-sensitivity as the first one, but specifies a different Trim setting.
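The effect of the Trim and Case-sensitive settings can be sketched in
Python. This is only a model of the behaviour described above, not the
way Parse-O-Matic is written. It assumes 1-based inclusive columns, that
padding around the key is ignored when comparing, and it returns None
when nothing matches (what POM itself does in that case is not covered
here). The function name lookup and the sample table are hypothetical.

```python
def lookup(table_lines, key, cols=(1, 10, 12, 255),
           trim=True, case_sensitive=False):
    """Return the data field of the first line whose key field matches."""
    key_start, key_end, data_start, data_end = cols
    wanted = key.strip()                 # assumption: key padding is ignored
    if not case_sensitive:
        wanted = wanted.upper()
    for line in table_lines:
        found = line[key_start - 1:key_end].strip()
        if not case_sensitive:
            found = found.upper()
        if found == wanted:
            data = line[data_start - 1:data_end]
            return data.strip() if trim else data
    return None                          # assumption: no match gives None


# Key in columns 1-25, data in columns 30-50.
table = ["JOHN SMITH".ljust(29) + "  John Q. Smith  "]
cols = (1, 25, 30, 50)
print(lookup(table, "John Smith", cols))                       # John Q. Smith
print(repr(lookup(table, "John Smith", cols, trim=False)))     # '  John Q. Smith  '
print(lookup(table, "John Smith", cols, case_sensitive=True))  # None
```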
-----------------
The TRACE Command
-----------------
FORMAT: TRACE var1
The TRACE command is an alternative to standard tracing (see "Tracing", in
the "Terms and Techniques" section).
When you include a TRACE command in your POM file, Parse-O-Matic will
create a text file, named POM.TRC, and use it to keep a detailed record of
POM's processing. Here is an example of the TRACE command:
TRACE PRICE
This traces the variable named "PRICE". After processing, the file POM.TRC
will show everything that happened, and give the value of PRICE at the
TRACE line.
NOTE: Since trace files are so detailed, they can be very large. If you
are trying to debug a POM file using TRACE, it is a good idea to use a
small input file.
===========================================================================
TERMS AND TECHNIQUES
===========================================================================
------
VALUES
------
A value can be specified in the following ways:
"text"               A literal text string
#number              A single ASCII character (e.g. #32 = Space)
#number#number...    Several ASCII characters (e.g. #32#32 = 2 Spaces)
VARNAME              The name of a variable
VARNAME[start end]   A substring of a variable
VARNAME[start]       A single character
VARNAME+             Incremented variable (see explanation below)
Variable names can be up to 8 characters long. There is no distinction
between upper and lower case in the variable name. You can create up to
220 variables and literals.
The # character is used to specify a literal text string of one or more
characters. Follow each # with the decimal value of the ASCII character
you want. Here are some useful values:
#10 = Line Feed #12 = Form Feed #13 = Carriage Return
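As an illustration of the # notation, the following Python sketch (not
part of Parse-O-Matic; the function name decode_hash_value is
hypothetical) converts a value such as #13#10 into the characters it
stands for.

```python
import re


def decode_hash_value(text):
    """Turn a value such as "#13#10" into the characters it represents."""
    if not re.fullmatch(r"(#\d+)+", text):
        raise ValueError("expected one or more #number groups: %r" % text)
    return "".join(chr(int(code)) for code in re.findall(r"#(\d+)", text))


print(repr(decode_hash_value("#32#32")))  # '  '
print(repr(decode_hash_value("#13#10")))  # '\r\n'
```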
Parse-O-Matic predefines several variables. They are:
$FLINE = The line just read from the file (max. length 255 characters)
$FLUPC = The line just read from the file, in uppercase
$BRL   = The { character (used in OUT)
$BRR   = The } character (used in OUT)
$TAB   = The tab character (hex $09; decimal 9)
$SPLIT = The CHOP or SPLIT number you are currently processing
Since $FLINE has a maximum length of 255 characters, you will have to use
the SPLIT or CHOP command if your input file is wider than that. The
$SPLIT variable reports which segment you are processing. For example,
if you had this command...
CHOP 1 255, 256 380
then $SPLIT would be set to "1" when it was processing columns 1 to 255,
and it would be set to "2" when it was processing columns 256 to 380.
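The relationship between the CHOP ranges and $SPLIT can be pictured with
a small Python sketch. It is a model only, not Parse-O-Matic's code, and
it assumes that columns are 1-based and inclusive. The function name
chop and the sample line are hypothetical.

```python
def chop(line, ranges):
    """Yield (split_number, segment) pairs for 1-based inclusive ranges."""
    for number, (start, end) in enumerate(ranges, start=1):
        yield str(number), line[start - 1:end]


wide_line = "A" * 255 + "B" * 125      # a 380-column input line
segments = list(chop(wide_line, [(1, 255), (256, 380)]))
for split_number, segment in segments:
    print(split_number, len(segment))  # prints "1 255" then "2 125"
```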
----------
DELIMITERS
----------
If you need a quotation mark inside a literal text string, double it (i.e.
write ""). For example:
IGNORE $FLINE = "He said ""Hello"" to me."
This would ignore lines containing: He said "Hello" to me.
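If you generate POM files from another program, the doubling rule is easy
to apply automatically. Here is a minimal Python sketch (the function
name pom_literal is hypothetical and is not part of Parse-O-Matic):

```python
def pom_literal(text):
    """Wrap text in quotation marks, doubling any quotation marks inside."""
    return '"' + text.replace('"', '""') + '"'


print(pom_literal('He said "Hello" to me.'))
# "He said ""Hello"" to me."
```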
------------------
ILLEGAL CHARACTERS
------------------
No command can contain these ASCII characters:
HEX   DECIMAL   NAME
---   -------   --------------------
$00      0      NULL
$0A     10      LF (Linefeed)
$0D     13      CR (Carriage Return)
Of course, LF and CR do appear at the end of each line, in a text file.
------------
INCREMENTING
------------
Only numeric incrementing is supported. Attempting to increment a variable
that holds a non-numeric value will result in an error.
- Incrementing "1" gives you "2"
- Incrementing "9" gives you "10"
The first time a variable is referenced, it has a null value. If you
increment this, it will be changed from "" (i.e. null) to "1".
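These rules can be summarized in a short Python sketch. It is a model of
the behaviour described above, not Parse-O-Matic's code. The function
name increment is hypothetical, and the sketch assumes that "numeric"
means a non-negative whole number.

```python
def increment(value):
    """Increment a value held as text, following the rules above."""
    if value == "":
        return "1"                   # a null value becomes "1"
    if not value.isdigit():          # assumption: whole numbers only
        raise ValueError("cannot increment non-numeric value %r" % value)
    return str(int(value) + 1)


print(increment("1"))  # 2
print(increment("9"))  # 10
print(increment(""))   # 1
```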
-------------
LINE COUNTERS
-------------
If your input record is divided over several lines (due to its original
format or perhaps because you used the SPLIT or CHOP command), it is
helpful to set up a line counter. The following example would extract the
first six characters of the second line of input records that span three
lines (designated lines 0, 1 and 2; LineCntr is null on line 0, because a
variable has a null value until it is first set):
IF LineCntr = "1" THEN MyField = $FLINE[1 6]
OUTEND LineCntr = "1" |{MyField}
IF LineCntr = "2" THEN LineCntr = "" ELSE LineCntr+
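To see how the counter cycles, here is the same logic as a Python sketch.
It is a simulation for illustration, not POM code. The function name
extract_second_lines and the sample input are hypothetical.

```python
def extract_second_lines(lines):
    """Collect the first six characters of the second line of each
    three-line record, using the same counter logic as the POM example."""
    line_cntr = ""                       # a new variable starts out null
    output = []
    for fline in lines:
        if line_cntr == "1":             # second line of the record
            output.append(fline[0:6])
        if line_cntr == "2":             # third line: reset for next record
            line_cntr = ""
        else:                            # null becomes "1", "1" becomes "2"
            line_cntr = "1" if line_cntr == "" else str(int(line_cntr) + 1)
    return output


sample = ["Record A, line 0", "AAAAAA line 1", "line 2",
          "Record B, line 0", "BBBBBB line 1", "line 2"]
print(extract_second_lines(sample))  # ['AAAAAA', 'BBBBBB']
```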
-------
TRACING
-------
By setting the DOS environment variable POM to ALL (with the DOS command
SET POM=ALL), you can generate a trace file, named POM.TRC. This is
helpful if you have trouble understanding why your file isn't being parsed
properly. But be sure to test it with a SMALL input file; the trace is
quite detailed, and it can easily generate a huge output file.
To save space, you can specify a particular list of variables to be traced,
rather than tracing everything. For example, to trace only the variable
PRICE, enter this DOS command:
SET POM=PRICE
To trace several variables, separate the variable names by slashes, as in
this example:
SET POM=PRICE/BONUS/NAME
This would trace the three variables PRICE, BONUS and NAME.
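The format of the POM setting can be pictured with a short Python sketch.
This is an illustration only, not the way Parse-O-Matic reads the
setting. The function name traced_variables is hypothetical, and treating
ALL without regard to case is an assumption. Names are converted to
uppercase because variable names do not distinguish upper and lower case.

```python
def traced_variables(pom_setting):
    """Return None for "trace everything", else the set of names to trace."""
    if pom_setting.upper() == "ALL":     # assumption: case is ignored
        return None
    return {name.upper() for name in pom_setting.split("/") if name}


print(traced_variables("PRICE/BONUS/NAME") == {"PRICE", "BONUS", "NAME"})  # True
print(traced_variables("ALL"))                                             # None
```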
----------
QUIET MODE
----------
Sometimes you don't want the user to see the Parse-O-Matic processing
screen. In such cases, you can use the "Quiet Mode" switch (/Q) on the
command line. For example:
POM XYZ.POM MYFILE.TXT TEMP.TXT /Q
The /Q switch suppresses the display of the processing screen. The only
time a user will see anything is if there is a problem (for example: the
input file was not found).
---------
DBF FILES
---------
If Parse-O-Matic notices that the input file is a dBASE file (i.e. it has a
DBF extension -- for example: MYFILE.DBF), it will process the data
somewhat differently. For instance, the variable $FLINE is not defined.
Rather, each of the fields in the database is pre-parsed into a variable
with the same name as the field. Thus, if you have a DBF file containing
three fields (EMPNUM, NAME, PHONE), your entire POM file might look like
this:
IGNORE DELETED "Y"
OUTEND |{EMPNUM} {NAME} {PHONE}
The DELETED variable is created automatically for each record. If it is
set to "Y", it means the record has been deleted from the database and is
probably not valid. In most cases, you will want to ignore such records.
If you do not know what the field names are, you can obtain the list with
the following POM file:
TRACE DELETED
Afterwards, when you inspect the trace file (POM.TRC), you will see a
summary of all the fields. Since there are no output commands (e.g. OUTEND
and OUTHDG), the output file will be empty.
--------------------------------
CONVERTING COMMA-DELIMITED FILES
--------------------------------
It is possible (but rather difficult) to create a POM file that will
convert comma-delimited files to columnar format. Fortunately, Pinnacle
Software has a useful utility (named CCDF) which can do this very easily.
When you register Parse-O-Matic, we will send you an evaluation copy of
CCDF. If you are in a hurry, you can download a copy from our free files
BBS at 514-345-8654. (Sign on as GUEST -- no password needed -- and enter
the command GET CCDF to start downloading the file CCDF.ZIP. The default
download protocol is ZMODEM, but you can change this with the PROTOCOL
command.)
--------
EXAMPLES
--------
Most of these techniques are demonstrated by the examples provided with the
standard Parse-O-Matic package. To see these examples, switch to your
Parse-O-Matic directory and type START at the DOS prompt.